DTSC 5301 | Final Project Report

Team: Energy

Authors¶

  • Rona Guo, rugu3582@colorado.edu
  • Sasi Jyothirmai Bonu, sabo8713@colorado.edu
  • Shivam Pandey, shpa5426@colorado.edu
  • Tushar Sharma, tush4938@colorado.edu
  • Veronica Martinez, veronica.martinez@colorado.edu
  • Wyett Considine, wyco0384@colorado.edu

Table Of Contents¶

  1. Introduction
  2. Questions
  3. Data Source
  4. Analysis
  5. Conclusion
  6. Citations

Introduction¶

Climate change is the central issue in public discourse on energy. A climate catastrophe would jeopardize our current well-being, the well-being of future generations, and the natural world. Many international treaties have been signed with the aim of reducing CO2 emissions. However, the world still lacks large-scale alternatives to fossil fuels that are secure, affordable, and low-carbon. Our analysis studies the make-up of energy consumption by source over the past five decades, quantitatively evaluates compliance by region, and assesses total energy consumption by region.

Questions¶

  • How has global per capita energy consumption changed over time?
  • How does the renewable vs. non-renewable makeup of this energy consumption compare?

Data Source¶

Data were accessed from Our World in Data. The dataset contains energy consumption per capita, measured in kWh per person per year (1965-2022), broken out by country and energy source. Our World in Data compiled it from the U.S. Energy Information Administration; the Energy Institute Statistical Review of World Energy; Gapminder (v7); United Nations World Population Prospects; HYDE (v3.2); and Gapminder (Systema Globalis). The decade-level charts use the five full decades from 1970 through 2019, which provides a useful historical sample for investigating our questions.

Analysis¶

To analyze the data, we used Python to read in our dataset, clean the data, and visualize the data to make comparisons. Below is the source code along with an explanation of steps taken and interpretations of the results.

Initialization¶

First, we installed and imported the required libraries to parse, clean, and plot data.

In [2]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# If using Jupyter in VSC:
# import plotly.io as pio
# pio.renderers.default='notebook'

Cleaning the Data¶

After downloading the dataset, we cleaned the data to prepare it for analysis. This included renaming the column headers for readability, filling missing values with zero, and grouping the sources into two categories, renewable and non-renewable. Nuclear is grouped with the renewable sources in this analysis. We also calculated total energy consumption for each category, as well as an overall total, to create a plot for comparison.

In [3]:
#Get data
energyDF = pd.read_csv('https://docs.google.com/spreadsheets/d/e/2PACX-1vSKFSOeC0K8NxNq_0ILnFa8bQguWi8QZopcNYLXLpc5NbXCffx5TV5rmPhSjnaT4aa0hoYlbXwoBY53/pub?gid=970132790&single=true&output=csv',
                    header = 0)

#Note: the __PerCap columns are measured in kWh or kWh equivalents
energyDF.rename(columns={'Entity':'Entity', 'Code':'EntCode', 'Year':'Year', 'Coal per capita (kWh)':'CoalPerCap',
       'Oil per capita (kWh)':'OilPerCap', 'Gas per capita (kWh)':'GasPerCap',
       'Nuclear per capita (kWh - equivalent)':'NuclearPerCap',
       'Hydro per capita (kWh - equivalent)':'HydroPerCap',
       'Wind per capita (kWh - equivalent)':'WindPerCap',
       'Solar per capita (kWh - equivalent)':'SolarPerCap',
       'Other renewables per capita (kWh - equivalent)':'OtherRenewablesPerCap'}, inplace=True)

energyDF.drop('EntCode', axis=1, inplace=True)
energyDF = energyDF.fillna(0)

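# Note: nuclear is grouped with the renewables here, although it is low-carbon rather than renewable.
renewables = ['NuclearPerCap', 'HydroPerCap', 'WindPerCap', 'SolarPerCap','OtherRenewablesPerCap']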
nonrenewables = ['CoalPerCap', 'OilPerCap', 'GasPerCap']

print(energyDF.head(2))
   Entity  Year  CoalPerCap  OilPerCap  GasPerCap  NuclearPerCap  HydroPerCap  \
0  Africa  1965   1006.3736  1064.3536  29.777592            0.0    127.91771   
1  Africa  1966    980.1728  1123.7390  32.452050            0.0    139.12254   

   WindPerCap  SolarPerCap  OtherRenewablesPerCap  
0         0.0          0.0                    0.0  
1         0.0          0.0                    0.0  
In [4]:
energyDF['TotalEnerygyPerCap'] = energyDF[renewables].sum(axis=1) + energyDF[nonrenewables].sum(axis=1)
energyDF['TotalNonRenewablePerCap'] = energyDF[nonrenewables].sum(axis=1)
energyDF['TotalRenewablePerCap'] = energyDF[renewables].sum(axis=1)
energyDF.head(5)
Out[4]:
Entity Year CoalPerCap OilPerCap GasPerCap NuclearPerCap HydroPerCap WindPerCap SolarPerCap OtherRenewablesPerCap TotalEnerygyPerCap TotalNonRenewablePerCap TotalRenewablePerCap
0 Africa 1965 1006.37360 1064.3536 29.777592 0.0 127.91771 0.0 0.0 0.0 2228.422502 2100.504792 127.91771
1 Africa 1966 980.17280 1123.7390 32.452050 0.0 139.12254 0.0 0.0 0.0 2275.486390 2136.363850 139.12254
2 Africa 1967 976.73170 1091.7719 31.268766 0.0 141.57660 0.0 0.0 0.0 2241.348966 2099.772366 141.57660
3 Africa 1968 990.00665 1125.0367 30.886887 0.0 161.39375 0.0 0.0 0.0 2307.323987 2145.930237 161.39375
4 Africa 1969 973.52277 1118.1862 35.162052 0.0 183.53688 0.0 0.0 0.0 2310.407902 2126.871022 183.53688
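
The category totals above can also be expressed as a share of the overall total, which speaks directly to the second question. The following is a minimal sketch on a two-row toy frame whose values are rounded from the Africa rows shown above. The `RenewableShare` column name is an assumption introduced for illustration and is not part of the original notebook.

```python
import pandas as pd

# Toy frame with values rounded from the Africa rows shown above.
toy = pd.DataFrame({
    "Entity": ["Africa", "Africa"],
    "Year": [1965, 1966],
    "TotalRenewablePerCap": [128.0, 139.0],
    "TotalNonRenewablePerCap": [2100.0, 2136.0],
})

# The overall total is the sum of the two category totals.
total = toy["TotalRenewablePerCap"] + toy["TotalNonRenewablePerCap"]

# Hypothetical column: renewable share of the total, in percent.
toy["RenewableShare"] = 100 * toy["TotalRenewablePerCap"] / total

print(toy[["Year", "RenewableShare"]].round(1))
```

Because the share is a ratio, it is unaffected by whether the underlying values are annual figures or decade sums, which makes it convenient for comparing regions.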

To visualize a comparison between renewable and non-renewable energy consumption over the past 50 years, we opted for a stacked bar chart. First we aggregated the data by decade, summing the annual per capita values within each decade, so that the data fit cleanly on a plot.

In [5]:
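# extract only data from 1970 and after; .copy() avoids SettingWithCopyWarning
# when the 'Decade' column is added below.
energyDF = energyDF[energyDF['Year']>=1970].copy()

# create a new column containing the decade for each data point (e.g. 1978 -> 1970)
energyDF['Decade'] = energyDF['Year'] - (energyDF['Year']%10)

# group data by decade; this sums the annual per capita values within each decade
df_decade = energyDF.groupby(by=['Decade','Entity']).sum().reset_index()

# drop the 2020s, which are incomplete (the data end in 2022)
df_decade = df_decade[df_decade['Decade']!=2020]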
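
The decade bucketing can be checked on a small toy frame. The sketch below uses one hypothetical entity `X` with made-up values, and contrasts the sum used in the notebook with a mean, which would keep the result on the same scale as an annual per capita figure. The mean is shown only as an alternative under that assumption; it is not what the notebook computes.

```python
import pandas as pd

# Toy annual per capita values (kWh) for one hypothetical entity.
toy = pd.DataFrame({
    "Entity": ["X"] * 4,
    "Year": [1978, 1979, 1980, 1981],
    "TotalEnerygyPerCap": [100.0, 200.0, 300.0, 500.0],
})

# Floor each year to the start of its decade, e.g. 1978 -> 1970.
toy["Decade"] = toy["Year"] - (toy["Year"] % 10)

grouped = toy.groupby(["Decade", "Entity"])["TotalEnerygyPerCap"]

# Sum: what the notebook computes (a decade total of annual values).
decade_sum = grouped.sum().reset_index()

# Mean: an alternative that stays on the annual per capita scale.
decade_mean = grouped.mean().reset_index()

print(decade_sum)
print(decade_mean)
```

With a sum, a full decade contributes ten annual values, so a decade bar reads roughly ten times higher than a typical single year.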

Visualizations¶

With the data restructured for analysis, we then created a stacked bar chart of total world energy consumption per capita per decade for five decades. The chart shows that renewable energy consumption has increased over the past 50 years. Over the same period, non-renewable energy consumption decreased in the 1980s and 1990s, but then increased by more than renewable energy consumption did in the 2000s and 2010s.

In [6]:
# Total world energy consumption (Renewable vs Non-Renewable), bar chart

# Create a new dataframe with only 'World' values and sort by decade.
df_world = df_decade[df_decade['Entity']=='World']
df_world = df_world.sort_values(by='Decade')

# Create a customized plot to compare renewable vs. non-renewable consumption
fig = go.Figure()

# fig.add_trace(go.Scatter(x=df_world['Decade'].sort_values(),
#                          y=df_world['TotalEnerygyPerCap']/1000,
#                          name='Total Energy',
#                          marker=dict(color='black', size=8),
#                          line=dict(width=3)
#                          ))

fig.add_trace(go.Bar(x=df_world['Decade'].sort_values(),
                     y=df_world['TotalRenewablePerCap']/1000, #convert to MWh
                     name='Renewable Energy',
                     marker=dict(color='#009e60')
                     ))

fig.add_trace(go.Bar(x=df_world['Decade'].sort_values(),
                     y=df_world['TotalNonRenewablePerCap']/1000, #convert to MWh
                     name='Non-Renewable Energy',
                     marker=dict(color='#ff6700')
                     ))

fig.update_layout(autosize=False,
                  width=1000,
                  height=700,
                  barmode='stack',
                  bargap=0.7,
                  xaxis_title=dict(text='Decade'),
                  yaxis_title=dict(text='Energy Consumption Per Cap (MWh)'),
                  title=dict(text='World Per Capita Energy Consumption and Source Over the Past 5 Decades'))

fig.show()

We also looked at data from the United States. The United States is one of the largest energy users in the world, so we wanted to see what energy sources make up this consumption and if renewable energy usage is increasing or decreasing. This is important because climate change is accelerating and where energy is sourced impacts the outcome of the next few decades.

The bar chart shows that the United States used less energy per capita in the 2010s than in any of the previous four decades. Additionally, the United States is using less non-renewable energy than in the 1970s and has increased its renewable energy consumption over the years. However, its per capita energy consumption is still significantly larger than the global figure: around 800 MWh vs. about 200 MWh. Both figures are decade totals, that is, sums of ten annual per capita values.

It's important to note that the total global consumption we saw in the first chart includes the United States. It would be a better comparison to see how the United States compares to the rest of the world.
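
A rest-of-world figure cannot be obtained by subtracting per capita values directly; both figures first have to be converted to totals using population. The sketch below shows the arithmetic under the assumption that population figures are available, which the dataset used here does not include. The function name `rest_of_world_per_capita` and all input numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def rest_of_world_per_capita(world_pc, world_pop, us_pc, us_pop):
    """Per capita consumption of the world excluding the US.

    Converts both per capita figures to totals, subtracts the US
    total from the world total, and divides by the remaining population.
    """
    world_total = world_pc * world_pop
    us_total = us_pc * us_pop
    return (world_total - us_total) / (world_pop - us_pop)


# Hypothetical inputs chosen for easy arithmetic, not real data:
# world: 20 MWh per person, 8,000 people; US: 80 MWh per person, 400 people.
result = rest_of_world_per_capita(20.0, 8000, 80.0, 400)
print(round(result, 2))
```

In this toy case the rest-of-world value comes out below the world average, as expected when a high-consuming member is removed.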

In [7]:
# Total USA energy consumption (Renewable vs Non-Renewable), Bar Chart

# Create a new dataframe with only 'United States' values and sort by decade.
df_US = df_decade[df_decade['Entity']=='United States']
df_US = df_US.sort_values(by='Decade')

# Create a customized plot to compare renewable vs. non-renewable consumption
fig = go.Figure()

# fig.add_trace(go.Scatter(x=df_US['Decade'].sort_values(),
#                          y=df_US['TotalEnerygyPerCap']/1000,
#                          name='Total Energy',
#                          marker=dict(color='black', size=8),
#                          line=dict(width=3)
#                          ))

fig.add_trace(go.Bar(x=df_US['Decade'].sort_values(),
                     y=df_US['TotalRenewablePerCap']/1000,  #convert to MWh
                     name='Renewable Energy',
                     marker=dict(color='#009e60')
                     ))

fig.add_trace(go.Bar(x=df_US['Decade'].sort_values(),
                     y=df_US['TotalNonRenewablePerCap']/1000, #convert to MWh
                     name='Non-Renewable Energy',
                     marker=dict(color='#ff6700')
                     ))

fig.update_layout(autosize=False,
                  width=1000,
                  height=700,
                  barmode='stack',
                  bargap=0.7,
                  xaxis_title=dict(text='Decade'),
                  yaxis_title=dict(text='Energy Consumption Per Cap (MWh)'),
                  title=dict(text='United States Per Capita Energy Consumption and Source Over the Past 5 Decades'))

fig.show()

Next, we looked at regional data.

Potential biases: regions contain a mix of wealthy and less affluent countries. These dynamics, and their impact on the data, cannot be understood by looking at regional totals alone. Additional analysis comparing regions that are similar to each other could tell a more complete story.
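
When member countries are relabeled with a region name, each region still has one row per country per year unless the rows are combined. The sketch below shows one way to collapse them to a single row per region and year. It uses hypothetical values and an unweighted mean; a population-weighted mean would be more faithful to a true regional per capita figure, but it needs population data, which is assumed to be unavailable here. The `Region` column and `region_map` are names introduced for illustration.

```python
import pandas as pd

# Hypothetical rows: two member countries plus one pre-aggregated region.
toy = pd.DataFrame({
    "Entity": ["United States", "Canada", "Asia",
               "United States", "Canada", "Asia"],
    "Year": [2000, 2000, 2000, 2001, 2001, 2001],
    "TotalEnerygyPerCap": [90.0, 110.0, 10.0, 88.0, 112.0, 11.0],
})

# Map member countries to a region label; other entities keep their name.
region_map = {"United States": "US & Canada", "Canada": "US & Canada"}
toy["Region"] = toy["Entity"].replace(region_map)

# Collapse to one row per (Region, Year) with an unweighted mean.
# A population-weighted mean would be more faithful but needs
# population data, which is assumed to be unavailable here.
regional = (toy.groupby(["Region", "Year"], as_index=False)
               ["TotalEnerygyPerCap"].mean())

print(regional)
```

Having exactly one row per region and year also keeps the x and y values of a plotted trace aligned.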

In [8]:
opec_countries = ['Algeria','Iran','Iraq','Kuwait','Saudi Arabia','United Arab Emirates','Venezuela']

us_and_canada = ['United States', 'Canada']

region_considered = ['Asia', 'European Union (27)', 'OPEC Countries', 'US & Canada']

# .copy() makes df_region an independent DataFrame, so the assignments below
# do not raise SettingWithCopyWarning.
df_region = df_decade[(df_decade['Entity'].isin(['Asia', 'European Union (27)']+opec_countries+us_and_canada))].copy()

# Relabel member countries with their region name. Rows are not re-aggregated,
# so 'OPEC' and 'US & Canada' keep one row per member country per decade.
df_region['Entity'] = df_region['Entity'].apply(lambda x: 'OPEC' if x in opec_countries else x)

df_region['Entity'] = df_region['Entity'].apply(lambda x: 'US & Canada' if x in us_and_canada else x)

In [9]:
fig = make_subplots(rows=2,
                    cols=2,
                    subplot_titles=("US & Canada", "European Union", "Asia", "OPEC Countries"))

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='US & Canada']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='US & Canada']['TotalRenewablePerCap']/1000,
                     name='Renewable Energy',
                     marker=dict(color='#009e60')
                     ), row=1, col=1)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='US & Canada']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='US & Canada']['TotalNonRenewablePerCap']/1000,
                     name='Non-Renewable Energy',
                     marker=dict(color='#ff6700')
                     ), row=1, col=1)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='European Union (27)']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='European Union (27)']['TotalRenewablePerCap']/1000,
                    #  name='Renewable Energy',
                     marker=dict(color='#009e60',),
                     showlegend = False
                     ), row=1, col=2)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='European Union (27)']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='European Union (27)']['TotalNonRenewablePerCap']/1000,
                    #  name='Non-Renewable Energy',
                     marker=dict(color='#ff6700'),
                     showlegend = False
                     ), row=1, col=2)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='Asia']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='Asia']['TotalRenewablePerCap']/1000,
                    #  name='Renewable Energy',
                     marker=dict(color='#009e60'),
                     showlegend = False
                     ), row=2, col=1)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='Asia']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='Asia']['TotalNonRenewablePerCap']/1000,
                    #  name='Non-Renewable Energy',
                     marker=dict(color='#ff6700'),
                     showlegend = False
                     ), row=2, col=1)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='OPEC']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='OPEC']['TotalRenewablePerCap']/1000,
                    #  name='Renewable Energy',
                     marker=dict(color='#009e60'),
                     showlegend = False
                     ), row=2, col=2)

fig.add_trace(go.Bar(x=df_region[df_region['Entity']=='OPEC']['Decade'].sort_values(),
                     y=df_region[df_region['Entity']=='OPEC']['TotalNonRenewablePerCap']/1000,
                    #  name='Non-Renewable Energy',
                     marker=dict(color='#ff6700'),
                     showlegend = False
                     ), row=2, col=2)

fig.update_traces(marker_line_width=0)

fig.update_layout(height=900,
                  width=1000,
                  bargap=0.4,
                  barmode='stack',
                  title_text="Energy Consumption by Region (MWh)",
                  xaxis=dict(tickvals = df_region['Decade'].unique()),
                  xaxis2=dict(tickvals = df_region['Decade'].unique()),
                  xaxis3=dict(tickvals = df_region['Decade'].unique()),
                  xaxis4=dict(tickvals = df_region['Decade'].unique())
                  )

fig.show()

When all four charts are compared, it is clear that the US & Canada region and the OPEC countries consume around 5 to 10 times the energy consumed per person in the European Union, which comprises 27 countries, and around 13 to 26 times the energy consumed per person in Asia. The charts also show that OPEC countries are the highest consumers of non-renewable energy sources.

Even in the regions that make the most use of renewable energy sources, the US & Canada and the European Union, renewable energy in the 2010s made up only around 17% and 33%, respectively, of the total energy consumed per capita.

In [10]:
# .copy() makes energyDF_region an independent DataFrame, so the assignments
# below do not raise SettingWithCopyWarning.
energyDF_region = energyDF[(energyDF['Entity'].isin(['Asia', 'European Union (27)']+opec_countries+us_and_canada))].copy()

# Relabel member countries with their region name. Rows are not re-aggregated,
# so 'OPEC' and 'US & Canada' keep one row per member country per year.
energyDF_region['Entity'] = energyDF_region['Entity'].apply(lambda x: 'OPEC' if x in opec_countries else x)

energyDF_region['Entity'] = energyDF_region['Entity'].apply(lambda x: 'US & Canada' if x in us_and_canada else x)

In [11]:
fig = make_subplots(rows=2,
                    cols=2,
                    subplot_titles=("US & Canada", "European Union", "Asia", "OPEC Countries"))

fig.add_trace(go.Scatter(x=energyDF_region[energyDF_region['Entity']=='US & Canada']['Year'].sort_values(),
                     y=energyDF_region[energyDF_region['Entity']=='US & Canada']['TotalEnerygyPerCap']/1000,
                     name='Total Energy',
                     marker=dict(color='black')
                     ), row=1, col=1)

fig.add_trace(go.Scatter(x=energyDF_region[energyDF_region['Entity']=='European Union (27)']['Year'].sort_values(),
                     y=energyDF_region[energyDF_region['Entity']=='European Union (27)']['TotalEnerygyPerCap']/1000,
                    #  name='Total Energy',
                     marker=dict(color='black',),
                     showlegend = False
                     ), row=1, col=2)

fig.add_trace(go.Scatter(x=energyDF_region[energyDF_region['Entity']=='Asia']['Year'].sort_values(),
                     y=energyDF_region[energyDF_region['Entity']=='Asia']['TotalEnerygyPerCap']/1000,
                    #  name='Total Energy',
                     marker=dict(color='black'),
                     showlegend = False
                     ), row=2, col=1)

fig.add_trace(go.Scatter(x=energyDF_region[energyDF_region['Entity']=='OPEC']['Year'].sort_values(),
                     y=energyDF_region[energyDF_region['Entity']=='OPEC']['TotalEnerygyPerCap']/1000,
                    #  name='Total Energy',
                     marker=dict(color='black'),
                     showlegend = False
                     ), row=2, col=2)

fig.update_traces(marker_line_width=0)

fig.update_layout(height=900,
                  width=1000,
                  bargap=0.4,
                #   barmode='stack',
                  title_text="Total Energy Consumption by Region (MWh)",
                  xaxis=dict(tickvals = df_region['Decade'].unique()),
                  xaxis2=dict(tickvals = df_region['Decade'].unique()),
                  xaxis3=dict(tickvals = df_region['Decade'].unique()),
                  xaxis4=dict(tickvals = df_region['Decade'].unique())
                  )

fig.show()

The line graphs display the trend in total energy use per person over the previous five decades for each region (US & Canada, OPEC countries, Asia, and the European Union). The US & Canada have broadly followed the same pattern as the European Union, although with greater swings. Energy consumption per person in the US & Canada peaked in 1985 and 1988 at about 118 MWh. In contrast, the EU's consumption peaked at 43 MWh in 2004 and 2006, only about 36% of the US & Canada peak. The trend appears to be downward, which is good news.

The OPEC graph is the most volatile, and OPEC nations turn out to be the heaviest energy consumers. OPEC countries reached 209 MWh per person in 2014, the highest value recorded for any region. Since 2014, however, there has been a sharp decline in energy use, and in 2022 they used less energy per person than the US & Canada and the European Union.

At first glance, Asian energy consumption per person appears to be rising almost continuously, which would be concerning. However, even at its peak (19 MWh in 2022), Asian energy consumption per capita is still lower than that of the EU and the US & Canada. If the trend persists, though, Asia may surpass other regions.
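
Peak values and peak years like those quoted above can be read off programmatically rather than from the chart. The sketch below is a minimal example on hypothetical data for two made-up regions `A` and `B`; the `TotalMWh` column name is an assumption for illustration. The same ratio arithmetic gives the 36% figure in the text, since 43 / 118 ≈ 0.36.

```python
import pandas as pd

# Hypothetical annual series (MWh per person) for two regions.
toy = pd.DataFrame({
    "Entity": ["A", "A", "A", "B", "B", "B"],
    "Year": [2000, 2001, 2002, 2000, 2001, 2002],
    "TotalMWh": [100.0, 120.0, 110.0, 40.0, 42.0, 48.0],
})

# Row label of the maximum value within each entity.
peak_idx = toy.groupby("Entity")["TotalMWh"].idxmax()

# Look up the peak rows to get both the year and the value.
peaks = toy.loc[peak_idx, ["Entity", "Year", "TotalMWh"]].set_index("Entity")

# Ratio of B's peak to A's peak, as a percentage.
ratio_pct = 100 * peaks.loc["B", "TotalMWh"] / peaks.loc["A", "TotalMWh"]

print(peaks)
print(round(ratio_pct, 1))
```

This approach assumes one row per region and year; if a region still has one row per member country, the rows would need to be combined first.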

Conclusion¶

Before the analysis, we posed two questions.

  • How has global per capita energy consumption changed over time?
  • How does the renewable vs. non-renewable makeup of this energy consumption compare?

To draw a conclusion, we summarize the answers that the analysis in this report supports.

Based on the data visualizations, we observed the following.

  • Global per capita energy consumption has increased substantially over the decades, and its trajectory has been influenced by various factors, including technological advancements, economic development, population growth, and shifts in energy sources. Because of these advancements, the demand for energy has grown steadily and shows no sign of slowing. Energy consumption tends to increase as a civilization strives to move forward, fueled by the desire to improve quality of life.

  • Up until 1990, total energy use per person remained essentially constant. Since 1990, the average person's energy usage has risen at a rate of about 7% per decade. The proportion of renewable energy, on the other hand, has risen slowly, at about 1% per decade. Asia has seen the most growth in energy consumption of the regions considered. The main issue raised by this growing demand is which energy source is used, that is, the debate between renewable and non-renewable sources.

  • The rise of renewable energy sources and the environmental movement has made consumers more eco-friendly. Efforts to combat climate change and reduce greenhouse gas emissions are driving the global transition toward greater reliance on renewable energy sources. However, the pace and success of this transition depend on numerous factors, including political will, economic considerations, and technological advancements. In some countries and regions, the transition to renewables has been more pronounced, with a growing percentage of electricity coming from wind, solar, and hydroelectric power. The OPEC countries have not yet reached sustainability and therefore rely more on non-renewable sources of energy. This is caused by a lack of knowledge and awareness, which in turn harms both the environment and consumers. Growth in the renewable energy sector has been strongest in the US & Canada, averaging 2% per decade, and in Europe, averaging 3% per decade.
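
Per-decade growth rates such as those cited in the bullets above can be computed from decade-level values with a percent change. The following is a minimal sketch on hypothetical numbers constructed to show a flat decade followed by two decades of 7% growth; it does not use the report's data.

```python
import pandas as pd

# Hypothetical decade-level per capita totals (MWh), not real data.
decades = pd.Series(
    [200.0, 200.0, 214.0, 228.98],
    index=[1980, 1990, 2000, 2010],
    name="TotalPerCap",
)

# Percent change from one decade to the next; the first entry is NaN.
growth_pct = decades.pct_change() * 100

print(growth_pct.round(1))
```

Note that a change in a share (for example, renewables going from 10% to 11% of the total) is a change in percentage points, which is a different quantity from the relative growth rate computed here.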

Citations¶

  1. Hannah Ritchie, Max Roser and Pablo Rosado (2022) - "Energy". Published online at OurWorldInData.org. Retrieved from: 'https://ourworldindata.org/energy' [Online Resource]

  2. Data from Feenstra et al. (2015) Penn World Table v10.0 via Our World in Data.

  • Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer (2015), “The Next Generation of the Penn World Table” American Economic Review, 105(10), 3150-3182, available for download at www.ggdc.net/pwt.

  • Max Roser (2013) – “Economic Growth”. Published online at OurWorldInData.org. Retrieved from: ‘https://ourworldindata.org/economic-growth’ [Online Resource]
